NYC Payroll Policy Analysis

Author

Shubh Goyal

Published

Invalid Date

1 Introduction

This report analyzes NYC payroll data to evaluate potential policies aimed at reducing taxpayer expenses. We examine three policies:

1. Capping salaries at the mayoral level

2. Increasing staffing to reduce overtime

3. A custom policy proposal

The analysis is based on historical payroll data, and all calculations are made using standardized assumptions.

2 Load Data and Libraries

2.1 Data Preparation

Show Code 💡
#|message: false
#|warning: false
library(readr)
library(stringr)
library(ggplot2)
library(tidyverse)
library(lubridate)
library(scales) 
library(DT)

2.2 Load Data

Show Code 💡
#|message: false
#|warning: false
payroll_data <- read_csv("C:/Users/goyal/OneDrive/Documents/MP01/data/mp01/nyc_payroll_export.csv")

2.3 Inspect Data

Show Code 💡
payroll_data <- payroll_data %>%
mutate(
agency_name = str_to_title(agency_name),
last_name = str_to_title(last_name),
first_name = str_to_title(first_name),
work_location_borough = str_to_title(work_location_borough),
title_description = str_to_title(title_description),
leave_status = str_to_title(leave_status_as_of_june_30)
)
Show Code 💡
payroll_data <- payroll_data %>% 
mutate(base_salary = as.numeric(base_salary), 
reg_hours = as.numeric(regular_hours), 
ot_hours = as.numeric(ot_hours))
glimpse(payroll_data)
Rows: 6,225,611
Columns: 19
$ fiscal_year                <dbl> 2024, 2024, 2024, 2024, 2024, 2024, 2024, 2…
$ payroll_number             <dbl> 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67,…
$ agency_name                <chr> "Admin For Children's Svcs", "Admin For Chi…
$ last_name                  <chr> "Faye Fall", "Kilgore", "Wisdom", "Miller",…
$ first_name                 <chr> "Sokhna", "Orlantha", "Cherise", "Moya-Gaye…
$ mid_init                   <chr> "M", "B", "M", "S", "M", "L", "O", NA, "N",…
$ agency_start_date          <dttm> 2023-11-20, 2023-08-28, 2022-10-24, 2023-0…
$ work_location_borough      <chr> "Bronx", "Brooklyn", "Manhattan", "Manhatta…
$ title_description          <chr> "Child Protective Specialist", "Child Prote…
$ leave_status_as_of_june_30 <chr> "ACTIVE", "ACTIVE", "ON LEAVE", "ON LEAVE",…
$ base_salary                <dbl> 62043, 62043, 43144, 62043, 60236, 62043, 6…
$ pay_basis                  <chr> "per Annum", "per Annum", "per Annum", "per…
$ regular_hours              <dbl> 1050.00, 1470.00, 1251.50, 1400.75, 700.00,…
$ regular_gross_paid         <dbl> 31267.96, 44660.96, 28649.20, 44515.43, 221…
$ ot_hours                   <dbl> 12.00, 99.75, 30.00, 44.75, 53.00, 146.00, …
$ total_ot_paid              <dbl> 425.00, 3859.84, 802.42, 1476.98, 1933.33, …
$ total_other_pay            <dbl> 78.04, 78.14, 78.26, 78.37, 78.47, 78.86, 7…
$ leave_status               <chr> "Active", "Active", "On Leave", "On Leave",…
$ reg_hours                  <dbl> 1050.00, 1470.00, 1251.50, 1400.75, 700.00,…

3 Mayor Eric Adams Payroll Analysis

3.1 Mayor Eric Adams’s Salary

Show Code 💡
mayor_data <- payroll_data %>%
filter(str_detect( first_name, "Eric") & str_detect( last_name , "Adams")) %>%
select( fiscal_year , title_description , agency_name , base_salary ) %>%
  arrange( fiscal_year )
mayor_data %>%
mutate( base_salary = dollar( base_salary )) %>%
datatable(options = list( searching = FALSE, paging = FALSE, info = FALSE))

3.2 Calculation of Total Compensation

Show Code 💡
payroll_data <- payroll_data %>%
mutate (total_compensation = case_when(
pay_basis == "per Annum" ~ base_salary,
pay_basis == "per Hour" ~ base_salary * regular_hours + (base_salary * 1.5 * ot_hours),
pay_basis == "per Day" ~ base_salary * (regular_hours / 7.5),TRUE ~ base_salary ) )

datatable(payroll_data %>%
select(first_name, last_name, agency_name, title_description, pay_basis, base_salary, regular_hours, ot_hours, total_compensation) %>%
arrange(desc(total_compensation)) %>% 
slice_head(n = 10),options = list(scrollX = TRUE))

4 Key Payroll Analysis

This section answers critical payroll-related questions using NYC Payroll Data.

  1. Which job title has the highest base rate of pay? (If needed, assume a standard 2000-hour work year and no overtime.)
Show Code 💡
highest_base_rate <- payroll_data %>%
mutate(hourly_rate = base_salary / 2000) %>%
arrange(desc(hourly_rate)) %>%
select(title_description, agency_name, hourly_rate) %>% slice(1)
highest_base_rate
# A tibble: 1 × 3
  title_description agency_name           hourly_rate
  <chr>             <chr>                       <dbl>
1 Chair             Nyc Housing Authority        207.
  1. Which individual & in what year had the single highest city total payroll (regular and overtime combined)?
Show Code 💡
highest_earning_employee <- payroll_data %>%
mutate(total_compensation = base_salary + total_ot_paid + total_other_pay) %>%
arrange(desc(total_compensation)) %>%
select(fiscal_year, first_name, last_name, title_description, agency_name, total_compensation) %>%
slice(1)
print(highest_earning_employee)
# A tibble: 1 × 6
  fiscal_year first_name last_name title_description     agency_name            
        <dbl> <chr>      <chr>     <chr>                 <chr>                  
1        2024 Mark       Tettonis  Chief Marine Engineer Department Of Transpor…
# ℹ 1 more variable: total_compensation <dbl>
  1. Which individual worked the most overtime hours in this data set?
Show Code 💡
most_overtime_employee <- payroll_data %>%
arrange(desc(ot_hours)) %>%
select(fiscal_year, first_name, last_name, title_description, agency_name, ot_hours) %>%
slice(1)
print(most_overtime_employee)
# A tibble: 1 × 6
  fiscal_year first_name last_name   title_description  agency_name     ot_hours
        <dbl> <chr>      <chr>       <chr>              <chr>              <dbl>
1        2022 James      Internicola Correction Officer Department Of …    3693.
  1. Which agency has the highest average total annual payroll (base and overtime pay per employee)?
Show Code 💡
highest_avg_payroll_agency <- payroll_data %>%
group_by(agency_name) %>%
summarize(avg_total_pay = mean(base_salary + total_ot_paid + total_other_pay, na.rm = TRUE)) %>%
arrange(desc(avg_total_pay)) %>% slice(1)
highest_avg_payroll_agency
# A tibble: 1 × 2
  agency_name             avg_total_pay
  <chr>                           <dbl>
1 Office Of Racial Equity       153102.
  1. Which agency has the most employees on payroll in each year?
Show Code 💡
most_employees_per_year <- payroll_data %>%
group_by(fiscal_year, agency_name) %>%
summarize(employee_count = n()) %>%
arrange(fiscal_year, desc(employee_count)) %>%
group_by(fiscal_year) %>% slice(1)
most_employees_per_year
# A tibble: 11 × 3
# Groups:   fiscal_year [11]
   fiscal_year agency_name            employee_count
         <dbl> <chr>                           <int>
 1        2014 Dept Of Ed Pedagogical         100589
 2        2015 Dept Of Ed Pedagogical         111857
 3        2016 Dept Of Ed Pedagogical         106263
 4        2017 Dept Of Ed Pedagogical         104629
 5        2018 Dept Of Ed Pedagogical         107956
 6        2019 Dept Of Ed Pedagogical         112067
 7        2020 Dept Of Ed Pedagogical         114999
 8        2021 Dept Of Ed Pedagogical         113523
 9        2022 Dept Of Ed Pedagogical         120453
10        2023 Dept Of Ed Pedagogical         106882
11        2024 Dept Of Ed Pedagogical         108209
  1. Which agency has the highest overtime usage (compared to regular hours)?
Show Code 💡
highest_overtime_usage <- payroll_data %>%
mutate(overtime_ratio = ot_hours / (2000 + ot_hours)) %>%
group_by(agency_name) %>%
summarize(avg_overtime_ratio = mean(overtime_ratio, na.rm = TRUE)) %>%
arrange(desc(avg_overtime_ratio)) %>% slice(1)
highest_overtime_usage
# A tibble: 1 × 2
  agency_name     avg_overtime_ratio
  <chr>                        <dbl>
1 Fire Department              0.135
  1. What is the average salary of employees who work outside the five boroughs? (That is, whose work_location_borough is not one of the five counties.)
Show Code 💡
avg_salary_outside_nyc <- payroll_data %>%
filter(!work_location_borough %in% c("Manhattan", "Brooklyn", "Queens", "Bronx", "Staten Island")) %>%
summarize(avg_salary = mean(base_salary, na.rm = TRUE))
avg_salary_outside_nyc
# A tibble: 1 × 1
  avg_salary
       <dbl>
1     53730.
  1. How much has the city’s aggregate payroll grown over the past 10 years?
Show Code 💡
payroll_growth <- payroll_data %>%
group_by(fiscal_year) %>% summarize(total_payroll = sum(base_salary + total_ot_paid + total_other_pay, na.rm = TRUE)) %>%
arrange(fiscal_year) %>%
mutate(payroll_growth = (total_payroll / lag(total_payroll) - 1) * 100,
payroll_growth = paste0(round(payroll_growth, 2), "%"))
payroll_growth
# A tibble: 11 × 3
   fiscal_year total_payroll payroll_growth
         <dbl>         <dbl> <chr>         
 1        2014  22638474550. NA%           
 2        2015  25470285473. 12.51%        
 3        2016  26510880597. 4.09%         
 4        2017  27208527853. 2.63%         
 5        2018  27162686275. -0.17%        
 6        2019  29546368177. 8.78%         
 7        2020  32146285486. 8.8%          
 8        2021  31552393078. -1.85%        
 9        2022  34872407402. 10.52%        
10        2023  33319364876. -4.45%        
11        2024  34700020886. 4.14%         

5 Policy Analysis

5.1 Policy 1: Capping Salaries at Mayoral Level

5.1.1 Compute the Mayor’s salary per year

Show Code 💡
mayor_salary <- payroll_data %>%
  filter(title_description == "Mayor") %>%
  select(fiscal_year, total_compensation)
datatable(mayor_salary, options = list(scrollX = TRUE))

5.1.2 Identify High Earners

Show Code 💡
high_salaries <- payroll_data %>%
inner_join(mayor_salary, by = "fiscal_year", suffix = c("_emp", "_mayor")) %>%
filter(total_compensation_emp > total_compensation_mayor)

5.1.3 Total Savings

Show Code 💡
total_savings <- sum(high_salaries$total_compensation_emp - high_salaries$total_compensation_mayor, na.rm = TRUE)

datatable(
  high_salaries %>%
    select(fiscal_year, first_name, last_name, title_description, agency_name, total_compensation_emp, total_compensation_mayor) %>%
    slice_head(n = 100),  
  options = list(scrollX = TRUE)
)

5.2 Policy 2: Increasing Staffing to Reduce Overtime Expenses

5.2.1 Calculate total overtime

Show Code 💡
overtime_reduction <- payroll_data %>%
group_by(agency_name, title_description) %>%
summarize( total_overtime_hours = sum(ot_hours, na.rm = TRUE), full_time_equivalent_needed = total_overtime_hours / 2000 ) %>% arrange(desc(total_overtime_hours))
datatable(overtime_reduction, options = list(scrollX = TRUE))

5.2.2 Total Potential Savings

Show Code 💡
overtime_savings <- payroll_data %>%
group_by(agency_name, title_description) %>%
summarize( overtime_cost = sum(1.5 * base_salary * ot_hours, na.rm = TRUE),
regular_cost = sum(base_salary * (ot_hours / 40), na.rm = TRUE),
potential_savings = overtime_cost - regular_cost ) %>%
arrange(desc(potential_savings))
datatable(overtime_savings, options = list(scrollX = TRUE))

5.2.3 Total Agency Savings

Show Code 💡
agency_savings <- overtime_savings %>%
group_by(agency_name) %>%
summarize( total_overtime_cost = sum(overtime_cost, na.rm = TRUE),
total_regular_cost = sum(regular_cost, na.rm = TRUE),
total_savings = sum(potential_savings, na.rm = TRUE) ) %>%
arrange(desc(total_savings))
datatable(agency_savings, options = list(scrollX = TRUE))

5.3 Policy 3: Overtime Reduction Through Strategy Hiring

5.3.1 Overview

Many NYC agencies rely heavily on overtime (OT) to compensate for staffing shortages, significantly inflating payroll costs. Instead of paying excessive overtime wages, the city could hire additional employees to reduce OT dependency.

Analysis Steps:

  1. Identify High-Overtime Employees
  2. Compare Costs
  3. Potential Savings
  4. Recommendation

5.3.2 Identify High-Overtime Employees

Find employees with excessive overtime hours (e.g., more than 500 OT hours per year).

Show Code 💡
high_overtime_employees <- payroll_data %>%
filter(ot_hours > 500) %>%
group_by(title_description, agency_name) %>%
summarize( avg_overtime_hours = mean(ot_hours, na.rm = TRUE),
total_overtime_pay = sum(total_ot_paid, na.rm = TRUE), num_high_overtime_employees = n(),
avg_base_salary = mean(base_salary, na.rm = TRUE) ) %>%
arrange(desc(total_overtime_pay))
datatable(high_overtime_employees, options = list(scrollX = TRUE))

5.3.3 Compare Costs of Overtime vs. Hiring

Assuming a standard 2,000-hour work year

Show Code 💡
estimated_new_hires <- high_overtime_employees %>%
mutate( equivalent_full_time_positions = round(avg_overtime_hours / 2000, 1), total_new_hire_salary = equivalent_full_time_positions * avg_base_salary ) %>%
select(agency_name, title_description, num_high_overtime_employees, equivalent_full_time_positions, total_overtime_pay, total_new_hire_salary)
datatable(estimated_new_hires, options = list(scrollX = TRUE))

5.3.4 Calculate Potential Savings

Compare the total overtime cost vs. the cost of hiring new employees

Show Code 💡
potential_savings <- estimated_new_hires %>%
mutate(savings = total_overtime_pay - total_new_hire_salary) %>%
arrange(desc(savings))
datatable(potential_savings, options = list(scrollX = TRUE))

5.3.5 Identify Agencies with Highest OT Dependency

Find agencies where overtime reliance is highest

Show Code 💡
library(dplyr)

overtime_by_agency <- payroll_data %>%
  group_by(agency_name) %>%
  summarize(
    total_overtime_pay = sum(total_ot_paid, na.rm = TRUE),
    total_base_salary = sum(base_salary, na.rm = TRUE),
    ot_to_salary_ratio = total_overtime_pay / total_base_salary  # Corrected calculation
  ) %>%
  arrange(desc(ot_to_salary_ratio))

5.3.6 Recommendations

  • Agencies with the highest OT-to-salary ratios should prioritize hiring over paying OT.
  • Job roles with excessive OT hours (e.g., NYPD, FDNY, sanitation workers) should be assessed for staffing shortages.
  • The city could cap annual OT hours per employee and use the savings to hire additional workers

6 Conclusion & Final Recommendation

Each policy presents unique benefits and trade-offs. Based on financial impact and feasibility:

Best Policy Choice: Policy III (Strategic Overtime Hiring).

Secondary Option: Policy II (Broad Hiring to Cut Overtime).

Least Effective: Policy I (Salary Cap), due to talent retention concerns.

Implement Policy 3 with phased hiring & overtime monitoring ( Final Recommendation )